Structural disambiguation of morpho-syntactic categorial parsing for Korean
Abstract
The Korean Combinatory Categorial Grammar (KCCG) formalism can uniformly handle word order variation among arguments and adjuncts within a clause as well as in complex clauses and across clause boundaries, i.e., long-distance scrambling. In this paper, an incremental parsing technique for a morpheme graph is developed using KCCG. We present techniques for choosing the most plausible parse tree using lexical information such as category merge probability, a head-head co-occurrence heuristic, and a heuristic based on the coverage of subtrees. The performance of various models for choosing the most plausible parse tree is compared.

1 Introduction

Korean is a non-configurational, postpositional, agglutinative language. Postpositions, such as noun-endings, verb-endings, and prefinal verb-endings, are morphemes that determine the functional role of NPs (noun phrases) and VPs (verb phrases) in sentences and also transform VPs into NPs or APs (adjective phrases). Since a sequence of prefinal verb-endings, auxiliary verbs and verb-endings can generate hundreds of different usages of the same verb, morpheme-based grammar modeling is a natural choice for Korean.

Various studies have addressed structural ambiguity in parsing. Lexical and contextual information has been shown to be most crucial for many parsing decisions, such as prepositional-phrase attachment (Hindle and Rooth, 1993). Charniak (1995) and Collins (1996) use lexical information, and Magerman and Marcus (1991) and Magerman and Weir (1992) use contextual information, for structural disambiguation. However, few studies have used probability information to reduce spurious ambiguities when choosing the most plausible parse tree in the CCG formalism, especially for morpho-syntactic parsing of agglutinative languages.

In this paper, we describe probabilistic methods (e.g., category merge probability, head-head co-occurrence, coverage heuristics) that reduce spurious ambiguities and choose the most plausible parse tree for agglutinative languages such as Korean.

* This research was partially supported by the KOSEF special basic research program (1997.9 ~ 2000.8).

2 Overview of KCCG

This section briefly reviews the basic KCCG formalism. Following (Steedman, 1985), order-preserving type-raising rules are used to convert nouns in the grammar into functors over a verb. The following rules are obligatorily activated during parsing when case-marking morphemes attach to nominal stems.

• Type Raising Rules: np + case-marker → v/(v\np[case-feature])

This rule indicates that a noun in the presence of a case morpheme becomes a functor looking for a verb on its right; this verb is also a functor looking for the original noun with the appropriate case on its left. After the noun functor combines with the appropriate verb, the result is a functor that is looking for the remaining arguments of the verb. 'v' is a variable for a verb phrase at any level, e.g., the verb of a matrix clause or the verb of an embedded clause. And 'v' is matched to all of ...
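The type-raising rule described above can be illustrated with a small sketch. This is not the authors' implementation. The class names (`Atom`, `Functor`), the helper functions, and the transitive verb category `(s\np[nom])\np[acc]` are all assumptions introduced for illustration only. The sketch shows a case-marked noun becoming `v/(v\np[case])`, combining with a verb to its right, and leaving a functor that still looks for the verb's remaining argument.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    name: str                   # e.g. "np", "s", or the variable "v"
    case: Optional[str] = None  # case feature such as "nom" or "acc"


@dataclass(frozen=True)
class Functor:
    result: "Category"
    slash: str                  # "/" seeks its argument on the right, "\\" on the left
    arg: "Category"


Category = Union[Atom, Functor]

V = Atom("v")  # variable standing for a verb category at any level


def type_raise(noun: Atom, case: str) -> Functor:
    # np + case-marker  ->  v/(v\np[case])
    if noun != Atom("np"):
        raise ValueError("only bare np categories are raised in this sketch")
    return Functor(V, "/", Functor(V, "\\", Atom("np", case)))


def substitute(cat: Category, binding: Category) -> Category:
    # Replace every occurrence of the variable v with the bound category.
    if cat == V:
        return binding
    if isinstance(cat, Functor):
        return Functor(substitute(cat.result, binding), cat.slash,
                       substitute(cat.arg, binding))
    return cat


def apply_raised(raised: Functor, verb: Category) -> Optional[Category]:
    # Forward application: v/(v\np[c]) combined with X\np[c] yields X,
    # binding v to X. Returns None when the case features do not match.
    wanted = raised.arg  # the pattern v\np[c]
    if not (isinstance(verb, Functor) and verb.slash == "\\"
            and verb.arg == wanted.arg):
        return None
    return substitute(raised.result, verb.result)


def show(cat: Category) -> str:
    # Render a category as a fully parenthesized string.
    if isinstance(cat, Atom):
        return cat.name + (f"[{cat.case}]" if cat.case else "")
    return f"({show(cat.result)}{cat.slash}{show(cat.arg)})"


# Hypothetical transitive verb: (s\np[nom])\np[acc]
verb = Functor(Functor(Atom("s"), "\\", Atom("np", "nom")), "\\", Atom("np", "acc"))
obj = type_raise(Atom("np"), "acc")    # object noun with an accusative marker
subj = type_raise(Atom("np"), "nom")   # subject noun with a nominative marker

rest = apply_raised(obj, verb)         # functor still seeking the subject
full = apply_raised(subj, rest)        # all arguments consumed
print(show(obj), show(rest), show(full))
```

In this sketch the variable `v` is bound by simple substitution, so the same raised noun can combine with a verb category at any depth whose outermost argument carries the matching case. Handling scrambled argument orders would need more machinery than is shown here.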
Similar resources
Morpho-syntactic Modeling of Korean with a Categorial Grammar
A morpho-syntactic categorial modeling for Korean is introduced. Variable categories and notations for word order treatment are newly invented and notations for elliptic arguments are also suggested. Incremental parsing technique of a morpheme graph is newly developed using the proposed categorial Korean modeling. Two heuristics based on the coverage of sub-trees and the part-of-speech bigrams ...
Korean Parsing in an Extended Categorial Grammar
This paper gives an automatic morpho-syntactic analysis with the ACCG parser, which uses Categorial Grammar and Combinatory Logic in the framework of Cognitive and Applicative Grammar. We focus on the contribution of the parser to the analysis of the morphological case system in Korean.
Syntactic Disambiguation by Using Categorial Parsing in a DOOD Framework
We present a natural language interface for Japanese that relies on semantically driven parsing in that it applies syntactic analysis only if necessary for disambiguation. For this purpose we utilize a categorial parser which also analyzes incomplete or ungrammatical input efficiently. The complete linguistic analysis is performed by means of deductive object-oriented database (DOOD) technology s...
Morphological and Syntactic Case in Statistical Dependency Parsing
Most morphologically rich languages with free word order use case systems to mark the grammatical function of nominal elements, especially for the core argument functions of a verb. The standard pipeline approach in syntactic dependency parsing assumes a complete disambiguation of morphological (case) information prior to automatic syntactic analysis. Parsing experiments on Czech, German, and H...
Publication date: 2000